
I'm going to talk about this project.

This is hopefully going to be a more thorough version of what I talked about at SC 24 because

I think I had like 20 minutes there and it was a little bit tight.

So I'll tell you about this project, which predates me and I'll explain the history and

the contributors and how we've used it.

So I started working on this project when I was at Intel and I was there for almost

seven years and we did a lot of interesting co-design things, lots of software co-design,

which is easy to talk about.

We did do some hardware stuff, which I can sort of talk about, but I'll explain the uses

of this code.

I'm fairly proud of this.

Tim Mattson retired, so I'm the curator and maintainer and developer at this point.

But I think there's a lot of neat stuff in here and I think it's a really useful tool

for computer scientists.

I think it's at this point somewhat unique and I'll talk about the landscape for that.

Yeah, so Tim Mattson created this starting sometime in the 1990s, and Tim was motivated

to understand computer architecture.

And of course, in the 1990s, parallel computing wasn't exactly the thing it is today.

Nowadays everything is SIMD, everything is multi-core.

But back in the day, things were a little bit simpler.

And so some of the kernels were really sequential things.

There's a branch prediction kernel, which I think has some interesting generalizations,

but that just shows you how broad the scope was: trying to understand architecture

very generally.

And then a couple of years before I got to Intel, Rob Van der Wijngaart, who's now my NVIDIA

colleague, worked on developing the initial parallel implementations with MPI and OpenMP

as they existed in the good old days, CPU only simple stuff, but focused on implementing

the algorithms and designing the parallel scheme, which is something I built upon.

And so when I joined, we were interested in exascale and programming models in general.

And so we did a lot of study, which then spawned some new ideas, and then I sort of

did some things I'll talk about.

So a lot of people have contributed.

Obviously, a bunch of people from Intel contributed because we try very hard to be lazy.

One of the things we're trying to do with this project is have it be a fair study from

the standpoint of expertise, which means Tim knows a lot about OpenMP and I know a lot about

MPI. If we wrote the MPI and OpenMP versions ourselves and then also wrote all the code in UPC or Charm++,

then that'd be an unfair comparison.

So we enlisted a lot of these folks like Jacob Nelson at the University of Washington and

others to help us, or we wrote code and then we sent it to folks and said, tell us what

we did wrong.

And I've been really fortunate.

There's some folks, I don't know if Carsten Bauer is on the call or if he graduated, but

I believe he contributed some really good Rust stuff.

Or was it Rust or Julia, I'm sorry, I can't remember all the details now, but some really good stuff.

And there have been a lot of contributions lately, especially in Rust and Julia, that we could

never have done ourselves.

So thanks everybody who contributed.

And this should tell you, we are a pretty open project.

If you're interested in playing around, it's easy to contribute, and we try to make it easy

for people who are interested.

Part of a chapter: NHR@FAU PerfLab Seminar

Access: Open access

Duration: 00:59:26 min

Recording date: 2025-03-03

Uploaded: 2025-03-03 11:36:04

Language: en-US

NHR PerfLab seminar talk on February 25, 2025
Speaker: Jeff Hammond, NVIDIA
Title: Hardware-software co-design with the Parallel Research Kernels
Slides: https://hpc.fau.de/files/2025/03/PRK_Jeff_Hammond_NHR.pdf
Abstract:
The Parallel Research Kernels (PRK) were created to be simple yet still interesting implementations of fundamental algorithms in high-performance computing, which can be used to evaluate and improve hardware and software systems. In this talk, I will describe the design methodology of the PRK and their use in multiple contexts. First, we consider the viability of alternative distributed programming models as compared to multiple flavors of MPI, especially the sensitivity to message granularity. Second, we demonstrate the use of the PRK to evaluate programming languages, from Python and C++17 to Rust and Julia. Finally, we use the PRK to measure the behavior of accelerators and heterogeneous memory systems. The PRK were created by Tim Mattson and Rob Van der Wijngaart; this talk is based on the collective efforts of more than a dozen contributors.
For a list of past and upcoming NHR PerfLab seminar events, see: https://hpc.fau.de/research/nhr-perflab-seminar-series/